(57) Abstract: The present invention provides a method and apparatus for detecting and managing the state of a computer network 
comprising network nodes with redundant network connections, and for recovering from multiple network faults. In one embodiment, 
a network status table is employed in each node to manage data related to the network state between the node and other nodes in 
the network. In various embodiments, rerouting of data is managed independently such that a communication path is independently 
selected for sending data from a node to a connected node and for receiving data from the connected node. The invention in some 
embodiments is operable to route data through one or more intermediate nodes where direct connection between a pair of nodes is 
not possible. 

Multiple Network Fault Tolerance via Redundant Network Control 

Field of the Invention 

The invention relates generally to computer networks, and more specifically 
to a method and apparatus providing a fault-tolerant network having a redundant 
connection to network nodes able to detect and recover from multiple network 
faults. 

Notice of Copending Applications 

This application is related to the following copending applications, which 
are hereby incorporated by reference: 

"Fault Tolerant Networking", serial number 09/188,976; and 
Atty. docket number 256.045US1 

Background of the Invention 

Computer networks have become increasingly important to communication 
and productivity in environments where computers are utilized for work. 
Electronic mail has in many situations replaced paper mail and faxes as a means of 
distribution of information, and the availability of vast amounts of information on 
the Internet has become an invaluable resource for both work-related and 
personal tasks. The ability to exchange data over computer networks also enables 
sharing of computer resources such as printers in a work environment, and enables 
centralized network-based management of the networked computers. 

For example, an office worker's personal computer may run software that is 
installed and updated automatically via a network, and that generates data that is 
printed to a networked printer shared by people in several different offices. The 
network may be used to inventory the software and hardware installed in each 
personal computer, greatly simplifying the task of inventory management. Also, 
the software and hardware configuration of each computer may be managed via the 
network, making the task of user support easier in a networked environment. 

Networked computers also typically are connected to one or more network 
servers that provide data and resources to the networked computers. For example, 
a server may store a number of software applications that can be executed by the 
networked computers, or may store a database of data that can be accessed and 
utilized by the networked computers. The network servers typically also manage 
access to certain networked devices such as printers, which can be utilized by any 
of the networked computers. Also, a server may facilitate exchange of data such as 
e-mail or other similar services between the networked computers. 

Connection from the local network to a larger network such as the Internet 
can provide greater ability to exchange data, such as by providing Internet e-mail 
access or access to the World Wide Web. These data connections make 
conducting business via the Internet practical, and have contributed to the growth 
in development and use of computer networks. Internet servers that provide data 
and serve functions such as e-commerce, streaming audio or video, e-mail, or 
provide other content rely on the operation of local networks as well as the Internet 
to provide a path between such data servers and client computer systems. 

But like other electronic systems, networks are subject to failures. 
Misconfiguration, broken wires, failed electronic components, and a number of 
other factors can cause a computer network connection to fail, leading to possible 
inoperability of the computer network. Such failures can be minimized in critical 
networking environments such as process control, medical, or other critical 
applications by utilization of backup or redundant network components. One 
example is use of a second network connection linking critical network nodes 
providing the same function as the first network connection. But, management of 
the network connections to facilitate operation in the event of a network failure can 
be a difficult task, and is itself subject to the ability of a network system or user to 
properly detect and compensate for the network fault. Furthermore, when both a 
primary and redundant network develop faults, exclusive use of either network will 
not provide full network operability. What is needed is a method and apparatus to 
detect and manage the state of a network of computers utilizing redundant 
communication channels. 

Summary of the Invention 

The present invention provides a method and apparatus for detecting and 
managing the state of a computer network comprising network nodes with 
redundant network connections, and for recovering from multiple network faults. 
In one embodiment, a network status table is employed in each node to manage 
data related to the network state between the node and other nodes in the network. 
In various embodiments, rerouting of data is managed independently such that a 
communication path is independently selected for sending data from a node to a 
connected node and for receiving data from the connected node. The invention in 
some embodiments is operable to route data through one or more intermediate 
nodes where direct connection between a pair of nodes is not possible. 

Brief Description of the Figures 

Figure 1 shows a diagram of a computer network with multiple nodes 
having primary and redundant network connections, consistent with an 
embodiment of the present invention. 

Figure 2 shows an example of a network status table, consistent with an 
embodiment of the present invention. 

Figure 3 shows a flowchart of a method of managing the state of a network 
of nodes having primary and redundant network connections, consistent with an 
embodiment of the present invention. 

Detailed Description 

In the following detailed description of sample embodiments of the 
invention, reference is made to the accompanying drawings which form a part 
hereof, and in which is shown by way of illustration specific sample embodiments 
in which the invention may be practiced. These embodiments are described in 
sufficient detail to enable those skilled in the art to practice the invention, and it is 
to be understood that other embodiments may be utilized and that logical, 
mechanical, electrical, and other changes may be made without departing from the 
spirit or scope of the present invention. The following detailed description is, 
therefore, not to be taken in a limiting sense, and the scope of the invention is 
defined only by the appended claims. 

The present invention provides a method and an apparatus for detecting and 
managing the state of network connections to facilitate operation of a redundant 
network in the event of a network failure. The invention is capable of 
compensating for multiple network faults, including faults in both the primary and 
the redundant network. In some embodiments, the invention selects either the 
primary or the redundant network connection for communicating data between 
each pair of network nodes, such that the network may continue to be fully 
operational so long as at least one connection is operable to transmit data and one 
connection is operable to receive data between each pair of network nodes. 

The invention in various forms is implemented using an existing network 
technology, such as Ethernet. In one such embodiment, two connections between 
each node are made via Ethernet connections — a primary network connection and 
a redundant network connection. In some such embodiments, off-the-shelf 
network adapters are utilized, and the invention controls the operation of the 
network adapters and manages communication via software executing on the 
computerized nodes. It is not critical for purposes of the invention which 
connection is the primary connection and which is the redundant connection, as the 
connections are physically and functionally similar. In the example embodiment 
discussed here, the primary and redundant network connections are interchangeable 
and are assigned names primarily for the purpose of distinguishing the networks 
from each other. 

Figure 1 illustrates an exemplary network with four nodes 101, 102, 103 
and 104. A primary network 105 and a redundant network 106 link each node to 
the other nodes of the network, as indicated by the directional lines connecting the 
nodes to each of the networks. To illustrate how the invention is operable to 
compensate for multiple network failures, the connection from node 3 at 103 to 
primary network 105 is broken such that node 3 cannot transmit data to network 
105 as shown at 107. Also, the connections linking node 4 at 104 to the redundant 
bus 106 are broken such that node 4 cannot receive data from the redundant bus as 
shown at 108 and cannot transmit data to the redundant bus as shown at 109. 

In a typical redundant network system, failure of a single connection 
between the primary network and a node such as is shown at 107 would cause all 
nodes on the network to switch to communicating via the redundant bus 106. In 
the network configuration shown in Figure 1, connections between node 4 and the 
redundant bus are also inoperable, making operation of the network using the 
redundant bus impossible. Such multiple failures make the network inoperable 
when exclusively using either the primary or redundant bus. 

The present invention provides a solution to this problem and enables 
communication between all network nodes during multiple failures such as are 
shown in Figure 1 by use of network status data and intelligent routing of data. In 
some embodiments of the invention, the network status data is stored in a network 
status table as shown in Figure 2. 

Figure 2 illustrates an example of a network status table for node 3 of the 
network of Figure 1, and contains data indicating the ability of node 3 to receive 
data from other nodes and the ability of other nodes to receive data from node 3. 
Specifically, the "Received Data OK" columns indicate the ability of node 3 to 
receive data from each of nodes 1, 2 and 4 on both the primary and redundant 
networks. The table indicates with an "X" that node 3 cannot receive data from 
node 4 over the redundant network connection, and indicates that node 3 can 
receive data from all other nodes via both the primary and redundant network 
connections with an "OK". The "X" indicating node 3's inability to receive data 
from node 4 is the result of the broken data transmit connection 109 between the 
redundant network 106 and node 4 (104). 

The "Other Node Report Data" columns represent the data reported to node 
3 by other nodes regarding the ability of the various other nodes to receive data 
from node 3. Because node 3's connection to the primary network 105 is broken at 
107 such that node 3 cannot send data over the connection, nodes 1, 2 and 4 are 
unable to receive data from node 3 on the primary network and so an "X" indicates 
a node 3 failure for each of these nodes. Also, the data connection between node 4 
and the redundant network is broken at 108 such that node 4 cannot receive data 
from the redundant network, so an "X" also indicates that node 4 is unable to 
receive data from node 3 in the node "4" column of the "Node 3 Redundant" row. 
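
The table of Figure 2 can be pictured as a small data structure. The following is only a sketch: the names Net, NetworkStatusTable, received_ok, reported_ok and node3_table are hypothetical and do not appear in the specification, and Python is used purely for illustration. The entries follow the faults 107, 108 and 109 described above.

```python
from dataclasses import dataclass
from enum import Enum


class Net(Enum):
    PRIMARY = "primary"
    REDUNDANT = "redundant"


@dataclass
class NetworkStatusTable:
    node: int
    # received_ok[net][peer]: this node can receive data from `peer` on `net`
    # (the "Received Data OK" portion).
    received_ok: dict
    # reported_ok[net][peer]: `peer` reports that it can receive data from
    # this node on `net` (the "Other Node Report Data" portion).
    reported_ok: dict

    def row(self, section: dict, net: Net) -> str:
        """Render one row using the "OK"/"X" marks of Figure 2."""
        peers = sorted(section[net])
        return " ".join("OK" if section[net][p] else "X" for p in peers)


def node3_table() -> NetworkStatusTable:
    """Table held by node 3 under the faults 107, 108 and 109 of Figure 1."""
    return NetworkStatusTable(
        node=3,
        received_ok={
            Net.PRIMARY: {1: True, 2: True, 4: True},
            Net.REDUNDANT: {1: True, 2: True, 4: False},  # fault 109
        },
        reported_ok={
            Net.PRIMARY: {1: False, 2: False, 4: False},  # fault 107
            Net.REDUNDANT: {1: True, 2: True, 4: False},  # fault 108
        },
    )
```

Each row lists nodes 1, 2 and 4 in order, so the "Node 3 Redundant" report row of this sketch renders as OK, OK, X.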

The determination of whether a node can receive data from another node is 
made in various embodiments using special-purpose diagnostic data signals, using 
network protocol signals, or using any other suitable type of data sent between 
nodes. The data each node provides to other nodes to populate the "Other Node 
Report Data" must necessarily include the data to be communicated 
between nodes, and is in one embodiment a special-purpose diagnostic data signal 
comprising the node data to be reported. 
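
One possible shape for such a diagnostic report is sketched below. The specification does not define a message format; the JSON encoding, the field names sender and received_ok, and the functions build_report and apply_report are all assumptions made for illustration only.

```python
import json


def build_report(sender: int, received_ok: dict) -> bytes:
    """Encode what `sender` can hear, as {network: {peer: bool}}."""
    payload = {
        "sender": sender,
        "received_ok": {
            net: {str(peer): ok for peer, ok in peers.items()}
            for net, peers in received_ok.items()
        },
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def apply_report(me: int, reported_ok: dict, message: bytes) -> None:
    """Record whether the report's sender can receive data from node `me`."""
    payload = json.loads(message.decode("utf-8"))
    sender = payload["sender"]
    for net, peers in payload["received_ok"].items():
        if str(me) in peers:
            reported_ok.setdefault(net, {})[sender] = peers[str(me)]
```

In this sketch node 3 would call apply_report once for each report it receives, filling in one column of its "Other Node Report Data" per reporting node.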

From the data in the network status table of Figure 2, the state of the 
various network connections can be determined and a suitable connection for 
communication between each pair of network nodes can be selected. In the 
example of Figures 1 and 2, nodes 1 and 2 are fully operational and may use either 
connection to communicate, and nodes 3 and 4 each have a fully operational 
connection to either the primary or redundant networks. Therefore, only nodes 3 
and 4 are unable to communicate over either the primary or redundant network 
exclusively. Node 3 cannot send data to the primary network, and node 4 cannot 
send data to or receive data from the redundant network, but node 3 can receive data from 
node 4 via the primary network. In some embodiments of the invention, node 3 
cannot send data to node 4 because no operable direct path over either the primary 
or redundant networks exists to send data. 
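
The independent choice of a sending path and a receiving path can be sketched as follows. This is a minimal illustration, assuming the table is held as nested dictionaries keyed by network name and peer number; the function name select_direct and the primary-first preference order are assumptions, the latter borrowed from the selection order recited in claim 9.

```python
NETWORKS = ("primary", "redundant")  # primary is tried first (assumption)


def select_direct(received_ok, reported_ok, peer):
    """Return (send_net, recv_net) for `peer`; either may be None.

    reported_ok[net][peer] says the peer can hear this node on `net`,
    so it governs the sending direction; received_ok[net][peer] says
    this node can hear the peer, so it governs the receiving direction.
    """
    send_net = next((n for n in NETWORKS if reported_ok[n][peer]), None)
    recv_net = next((n for n in NETWORKS if received_ok[n][peer]), None)
    return send_net, recv_net


# Node 3's view under the faults of Figure 1, as tabulated in Figure 2.
NODE3_RECEIVED = {
    "primary": {1: True, 2: True, 4: True},
    "redundant": {1: True, 2: True, 4: False},
}
NODE3_REPORTED = {
    "primary": {1: False, 2: False, 4: False},
    "redundant": {1: True, 2: True, 4: False},
}
```

With these values, node 3 would send to node 1 over the redundant network while receiving from node 1 over the primary network, and would find a receiving path but no direct sending path for node 4.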

In other embodiments of the invention, node 3 may transmit the data to 
node 4 via another node with an "OK" indication for either network in the "Other 
Node Report Data" rows of the table such as node 1 or node 2. In such 
embodiments, the "OK" nodes or intermediate nodes are known to be able to 
receive data from node 3, and can retransmit the data to node 4 via their fully 
functional primary network connections. This allows communication between two 
nodes where multiple network failures prevent direct communication between two 
nodes. In further embodiments, the intermediate node to which the data is routed is 
selected via polling the intermediate nodes to select a node that indicates it is able 
to retransmit data to node 4 by evaluation of the data in each of the intermediate 
nodes' network status table. In various embodiments of the invention, the 
intermediate nodes may comprise networked computers as in the example above, 
may comprise a direct connection between networks, may comprise a router or 
bridge, may comprise a special-purpose intermediate node hardware device, or may 
be implemented in any other way that provides the ability to suitably communicate 
signals between the two networks. 
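
Selection of an intermediate node might look like the sketch below. It assumes that polling yields each candidate's own "Other Node Report Data", here passed in as a dictionary; the names usable_net and find_relay, and the choice to try candidates in ascending node number, are hypothetical.

```python
NETWORKS = ("primary", "redundant")


def usable_net(status, peer):
    """First network on which status[net][peer] is marked OK, else None."""
    return next((n for n in NETWORKS if status.get(n, {}).get(peer)), None)


def find_relay(src_reported_ok, relay_reported_ok, dest):
    """Return (relay, first_hop_net, second_hop_net), or None if no relay fits.

    src_reported_ok: the source's "Other Node Report Data", {net: {peer: bool}}.
    relay_reported_ok: {relay: that relay's own "Other Node Report Data"},
        standing in for the answers obtained by polling each candidate.
    """
    for relay in sorted(relay_reported_ok):
        first_hop = usable_net(src_reported_ok, relay)
        second_hop = usable_net(relay_reported_ok[relay], dest)
        if first_hop and second_hop:
            return relay, first_hop, second_hop
    return None
```

For the faults of Figure 1, node 3 can reach node 1 only over the redundant network and node 1 can reach node 4 only over the primary network, so this sketch would route data from node 3 to node 4 through node 1 using those two hops.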

Figure 3 is a flowchart illustrating a method of practicing one embodiment 
of the present invention. At 301, each node determines the state of the primary 
network connection linking it to each other node. Also, the state of the redundant 
network connection linking each node to each other node is determined at 302. 
The state of the primary and redundant connections between each pair of nodes 
is determined in various embodiments by searching the connections for existing 
data such as valid data or protocol packets, or by use of special-purpose diagnostic 
messages. This network connection state data is used at 303 to build the "Received 
Data OK" portion of a network status table for each node, and the nodes exchange 
data with each other at 304 to complete the "Other Node Report Data" portion of 
the network status table. The network status table is updated regularly, and is 
monitored at 305 to determine whether a network connection has failed and 
requires rerouting of data. 
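
Steps 301 through 304 can be imitated with a small simulation in which real probing is replaced by a lookup of known faults. The fault sets, the function names can_deliver and build_table, and the dictionary layout are assumptions made for this sketch and are not part of the described method.

```python
NETWORKS = ("primary", "redundant")


def can_deliver(src, dst, net, tx_broken, rx_broken):
    """True if data sent by `src` on `net` would arrive at `dst`."""
    return (src, net) not in tx_broken and (dst, net) not in rx_broken


def build_table(node, nodes, tx_broken, rx_broken):
    """Steps 301-304 for one node, with simulated probes instead of real I/O."""
    peers = [n for n in nodes if n != node]
    # 301-303: what this node hears from each peer on each network.
    received_ok = {
        net: {p: can_deliver(p, node, net, tx_broken, rx_broken) for p in peers}
        for net in NETWORKS
    }
    # 304: what each peer would report about hearing this node.
    reported_ok = {
        net: {p: can_deliver(node, p, net, tx_broken, rx_broken) for p in peers}
        for net in NETWORKS
    }
    return {"received_ok": received_ok, "reported_ok": reported_ok}


# Faults of Figure 1: 107 (node 3 cannot transmit on primary),
# 108 (node 4 cannot receive on redundant), 109 (node 4 cannot transmit
# on redundant).
TX_BROKEN = {(3, "primary"), (4, "redundant")}
RX_BROKEN = {(4, "redundant")}
```

Calling build_table(3, [1, 2, 3, 4], TX_BROKEN, RX_BROKEN) reproduces the entries described for node 3 in connection with Figure 2.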

At 306, the node determines by examination of the network status table 
whether a direct connection for transmitting and receiving data between the pair of 
nodes with a failed connection can be made. If a connection can be made, such as 
by transmitting data via the primary network connection and receiving data through 
the redundant network connection, the data is rerouted through the direct 
connections at 307 and monitoring for additional failures resumes at 305. If a 
direct connection cannot be made, data is rerouted through one or more 
intermediate nodes at 308 to facilitate communication, as was described in 
accordance with the multiple network failure example illustrated in Figures 1 and 
2. Again, once a data path through one or more intermediate nodes has been 
selected, monitoring for additional network failures resumes at 305. 
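
The decision flow of steps 305 through 308 is sketched below for the sending direction only. The function names newly_failed and reroute_step, the tuple results, and the assumption that the caller already knows which intermediate nodes can reach the destination are all hypothetical simplifications.

```python
NETWORKS = ("primary", "redundant")


def newly_failed(previous, current):
    """Step 305: list (net, peer) entries that changed from OK to failed."""
    return [
        (net, peer)
        for net in NETWORKS
        for peer, ok in current[net].items()
        if previous[net].get(peer, True) and not ok
    ]


def reroute_step(reported_ok, peer, relays_reaching_peer):
    """Steps 306-308 for sending to `peer`.

    relays_reaching_peer: intermediate nodes already known (for example
    by polling) to be able to forward data to `peer`.
    """
    for net in NETWORKS:  # 306: is a direct path still available?
        if reported_ok[net][peer]:
            return ("direct", net)  # 307
    for relay in relays_reaching_peer:  # 308: fall back to an intermediate
        for net in NETWORKS:
            if reported_ok[net][relay]:
                return ("relay", relay, net)
    return ("unreachable",)
```

After either outcome the caller would return to monitoring, comparing each refreshed table against the previous one with newly_failed.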

The present invention provides a method and apparatus that enable a 
network with primary and redundant network connections to manage routing of 
data through the network such that multiple network failures can be compensated 
for. In some embodiments, the invention includes rerouting data that cannot be 
transferred directly between two nodes to intermediate nodes which are able to 
facilitate communication between the nodes. The invention also incorporates 
construction and use of a network status table in some embodiments for managing 
data related to the network state. The invention includes in various embodiments a 
method for managing the state of the network, software for execution on a 
computer for managing the state of the network, and a hardware network interface 
that is operable to manage the state of the network. 

Although specific embodiments have been illustrated and described herein, 
it will be appreciated by those of ordinary skill in the art that any arrangement 
which is calculated to achieve the same purpose may be substituted for the specific 
embodiments shown. This application is intended to cover any adaptations or 
variations of the invention. It is intended that this invention be limited only by the 
claims, and the full scope of equivalents thereof. 






WO 01/63850 

PCT/US01/05834 



Claims 

1. A method of managing the state of a computer network with redundant network 
connections, comprising: 

determining the state of a primary network connection between each pair of 
networked nodes; 

determining the state of a redundant network connection between each pair 
of networked nodes; and 

selecting either the primary network connection or the redundant network 
connection for sending and receiving data between each pair of networked nodes, 
such that the network path selected to be used to communicate is selected 
independently based on the determined network states for each pair of networked 
nodes. 

2. The method of claim 1, further comprising building a network status table that 
indicates results of determining the state of primary and redundant network 
connections between each pair of networked nodes. 

3. The method of claim 2, wherein the network status table comprises data 
representing network status based on data received at a node from other network 
nodes. 

4. The method of claim 3, wherein the data received at a node from other 
networked nodes comprises a diagnostic message. 

5. The method of claim 4, wherein the data received at a node from other 
networked nodes comprises data representing the ability of the other nodes to 
receive data from other different network nodes. 

6. The method of claim 2, wherein the network status table comprises data 
representing network status based on a node's ability to send data to other nodes. 

7. The method of claim 3, wherein the network status table further comprises data 
representing network status based on a node's ability to send data to other nodes. 

8. The method of claim 1, wherein selecting the primary or redundant network 
connection for communication between each pair of networked nodes comprises: 

selecting the primary network connection if the state of the primary network 
connection is determined to be operable; and 

selecting the redundant network connection if the state of the primary 
network connection is determined to be inoperable. 

9. The method of claim 1, wherein selecting the primary or redundant network 
connection for communication between each pair of networked nodes comprises: 

selecting the primary network connection to transmit data if the state of the 
primary network connection is determined to be operable to transmit data; 

selecting the primary network connection to receive data if the state of the 
primary network connection is determined to be operable to receive data; 

selecting the redundant network connection to transmit data if the state of 
the primary network connection is determined to be inoperable to transmit data; 
and 

selecting the redundant network connection to receive data if the state of the 
primary network connection is determined to be inoperable to receive data. 

10. The method of claim 1, wherein selecting a connection for sending and 
receiving data between each pair of network nodes comprises selecting a 
connection for sending and receiving data from a first node to one or more 
connected intermediate nodes, and selecting a connection for sending and receiving 
data from an intermediate node to a second node. 

11. A computer network interface, the interface operable to: 

determine the state of a primary network connection between the network 
interface and the network interfaces of other network nodes; 

determine the state of a redundant network connection between the network 
interface and the network interfaces of other network nodes; and 

select either the primary network connection or the redundant network 
connection for communication with each of the other network nodes, such that the 
network connection selected is selected independently based on the determined 
network states for each other network node. 

12. The computer network interface of claim 11, the interface further comprising a 
network status table that indicates results of the determination of the state of the 
primary and redundant network connections between the computer network 
interface and the network interfaces of other network nodes. 

13. The computer network interface of claim 12, wherein the network status table 
comprises data representing network status based on data received at a node from 
other network nodes. 

14. The computer network interface of claim 13, wherein the data received at a 
node from other networked nodes comprises a diagnostic message. 

15. The computer network interface of claim 14, wherein the data received at a 
node from other networked nodes further comprises data representing the ability of 
the other nodes to receive data from other different network nodes. 

16. The computer network interface of claim 12, wherein the network status table 
comprises data representing network status based on a node's ability to send data to 
other nodes. 

17. The computer network interface of claim 13, wherein the network status table 
further comprises data representing network status based on a node's ability to send 
data to other nodes. 

18. The computer network interface of claim 11, wherein selecting either the 
primary network connection or the redundant network connection for 
communication with each of the other network nodes comprises: 

selecting the primary network connection if the state of the primary network 
connection is determined to be operable; and 

selecting the redundant network connection if the state of the primary 
network connection is determined to be inoperable. 

19. The computer network interface of claim 11, wherein selecting either the 
primary network connection or the redundant network connection for 
communication with each of the other network nodes comprises: 

selecting the primary network connection to transmit data if the state of the 
primary network connection is determined to be operable to transmit data; 

selecting the primary network connection to receive data if the state of the 
primary network connection is determined to be operable to receive data; 

selecting the redundant network connection to transmit data if the state of 
the primary network connection is determined to be inoperable to transmit data; 
and 

selecting the redundant network connection to receive data if the state of the 
primary network connection is determined to be inoperable to receive data. 

20. The computer network interface of claim 11, wherein selecting a connection 
for sending and receiving data between each pair of network nodes comprises 
selecting a connection for sending and receiving data from a first node to one or 
more connected intermediate nodes, and selecting a connection for sending and 
receiving data from an intermediate node to a second node. 

21. A machine-readable medium with instructions thereon, the instructions when 
executed on a computer operable to cause the computer to: 

determine the state of a primary network connection between the network 
interface and the network interfaces of other network nodes; 

determine the state of a redundant network connection between the network 
interface and the network interfaces of other network nodes; and 

select either the primary network connection or the redundant network 
connection for communication with each of the other network nodes, such that the 
network connection selected is selected independently based on the determined 
network states for each other network node. 

22. The machine-readable medium of claim 21, the instructions further operable to 
cause a computer to create and maintain a network status table that indicates results 
of the determination of the state of the primary and redundant network connections 
between the computer network interface and the network interfaces of other 
network nodes. 

23. The machine-readable medium of claim 22, wherein the created network status 
table comprises data representing network status based on data received at a node 
from other network nodes. 

24. The machine-readable medium of claim 23, wherein the data received at a 
node from other networked nodes comprises a diagnostic message. 

25. The machine-readable medium of claim 24, wherein the data received at a 
node from other networked nodes further comprises data representing the ability of 
the other nodes to receive data from other different network nodes. 

26. The machine-readable medium of claim 22, wherein the created network status 
table comprises data representing network status based on a node's ability to send 
data to other nodes. 

27. The machine-readable medium of claim 23, wherein the network status table 
further comprises data representing network status based on a node's ability to send 
data to other nodes. 

28. The machine-readable medium of claim 21, wherein selecting either the 
primary network connection or the redundant network connection for 
communication with each of the other network nodes comprises: 

selecting the primary network connection if the state of the primary network 
connection is determined to be operable; and 

selecting the redundant network connection if the state of the primary 
network connection is determined to be inoperable. 

29. The machine-readable medium of claim 21, wherein selecting either the 
primary network connection or the redundant network connection for 
communication with each of the other network nodes comprises: 

selecting the primary network connection to transmit data if the state of the 
primary network connection is determined to be operable to transmit data; 

selecting the primary network connection to receive data if the state of the 
primary network connection is determined to be operable to receive data; 

selecting the redundant network connection to transmit data if the state of 
the primary network connection is determined to be inoperable to transmit data; 
and 

selecting the redundant network connection to receive data if the state of the 
primary network connection is determined to be inoperable to receive data. 

30. The machine-readable medium of claim 21, wherein selecting a connection for 
sending and receiving data between each pair of network nodes comprises selecting 
a connection for sending and receiving data from a first node to one or more 
connected intermediate nodes, and selecting a connection for sending and receiving 
data from an intermediate node to a second node. 